Understanding the effect of Sampling Rate

Most signals in life are continuous: pressure waves propagating through air, chemical reactions, body movement. For computers to process these continuous signals, however, they must be converted to digital representations via an Analog-to-Digital Converter (ADC). One major way in which a digital signal differs from its continuous counterpart is that it is sampled at specific time steps. For example, sound is often sampled at 44.1 kHz (or once every 0.023 milliseconds) and an accelerometer is often sampled at 100 Hz (once every 0.01 seconds).
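As a quick check of the arithmetic above, the time between samples is simply the reciprocal of the sampling rate. The helper name `sample_period_ms` below is our own, chosen for illustration only.

```python
def sample_period_ms(sampling_rate_hz: float) -> float:
    """Return the time between consecutive samples, in milliseconds."""
    return 1000.0 / sampling_rate_hz


# The two sampling rates mentioned in the text: audio and an accelerometer.
for rate_hz in (44_100, 100):
    print(f"{rate_hz} Hz -> one sample every {sample_period_ms(rate_hz):.4f} ms")
```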

In this example, we will use audio data as our primary signal. Sound is a wonderful medium for learning because we can hear the signal. Recall that a microphone responds to air pressure waves. We suggest plugging in your headphones, so you can really hear the distinctions in the various audio samples.

Dependencies

This notebook requires LibROSA, a Python package for music and audio analysis. To install this package, you have two options.

First, from within the notebook, you can execute the following two lines in a cell (you'll only need to run this once):

import sys
!{sys.executable} -m pip install librosa

Second, from within your Anaconda shell:

> conda install -c conda-forge librosa

About this Notebook

This Notebook was designed and written by Professor Jon E. Froehlich at the University of Washington along with feedback from students. It is made available freely online as an open educational resource at the teaching website: https://makeabilitylab.github.io/physcomp/.

The notebook code is open source using the MIT license.

Main imports

Sampling

A continuous signal (in green) is sampled every $T$ seconds, that is, at a rate of $\frac{1}{T}$ Hz (in blue). From Wikipedia

A key factor in digitizing a signal is the rate at which the analog signal is sampled (or captured). How often must you sample a signal to perfectly reconstruct it?

Nyquist Sampling Theorem

The answer may surprise you and involves one of the most fundamental (and interesting) theorems in signal processing: the Nyquist Sampling Theorem, which states that a continuous signal can be reconstructed as long as there are at least two samples per period for the highest frequency component in the underlying signal.

That is, for a perfect reconstruction, our digital sampling frequency $f_s$ must be at least twice as fast as the fastest frequency in our continuous signal: $f_s \geq 2 * max(analog_{freq})$.

For example, imagine we have an analog signal composed of frequencies between 0 and 2,000 Hz. To properly digitize this signal, we must sample at a minimum of $2 * 2,000Hz$. So, $f_s$ needs to be at least 4,000Hz.

Now imagine that the fastest our digitizer can sample is 6,000 Hz: what frequency range can we properly capture? Since we need a minimum of two samples per period for proper reconstruction, we can only capture signals that change with a frequency of 0 to a maximum of 3,000Hz. This 3,000Hz limit is called the Nyquist frequency or Nyquist limit: it is $\frac{1}{2}$ the sampling rate $f_s$. (The related term Nyquist rate refers to the other direction: the minimum sampling rate needed for a given signal, which is twice its highest frequency.)
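The two worked examples above are the same relationship read in opposite directions. A minimal sketch, with function names of our own choosing:

```python
def min_sampling_rate(max_signal_freq_hz: float) -> float:
    """Smallest sampling rate that still gives two samples per period
    of the highest frequency in the signal."""
    return 2.0 * max_signal_freq_hz


def nyquist_limit(sampling_rate_hz: float) -> float:
    """Highest frequency that a given sampling rate can capture."""
    return sampling_rate_hz / 2.0


print(min_sampling_rate(2_000))  # 4000.0
print(nyquist_limit(6_000))      # 3000.0
```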

For many applications related to Human-Computer Interaction and Ubiquitous Computing, sampling at 4kHz is more than sufficient. This enables analysis of any signal between 0-2kHz. Human motion (ambulatory movement, limb motion, finger gestures, etc.) simply does not change that fast. Even electroencephalograms (EEG), which measure electrical activity in the brain, are often sampled at 500-1000Hz. However, for recording sound (humans can hear from roughly 20Hz to 20kHz), faster sampling rates are necessary.

Aliasing

What happens if we sample a signal with frequency components greater than the Nyquist limit ($> \frac{1}{2} * f_s$)? We get aliasing, a problem where the higher frequency components of a signal (those greater than the Nyquist limit) appear as lower frequency components. As Smith notes, "just as a criminal might take on an assumed name or identity (an alias), the sinusoid assumes another frequency that is not its own." And perhaps more nefariously, there is nothing in the sampled data to suggest that aliasing has occurred: "the sine wave has hidden its true identity completely". See figure below.

Aliasing example

Let's look at an example. Here, we'll sample four signals at a sampling rate of 50Hz: signal1 = 5Hz, signal2 = 10Hz, signal3 = 20Hz, and signal4 = 60Hz. All but signal4 are under our Nyquist limit, which is $\frac{1}{2} * 50Hz = 25Hz$. What will happen?

The "samples" are shown as vertical lines with square markers. What do you observe? Pay close attention to signal4...

Let's take a closer look at signal2 = 10Hz and signal4 = 60Hz. To make it easier to see the sampled signal, we'll lighten the underlying continuous (real-world) signal in blue.

Can you see it? The 60Hz signal is being aliased as 10Hz. And once the signal is digitized (as it is here), there would be no way to tell the difference between an actual 10Hz signal and an aliased one!

Why?

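Look at the graphs. At the first sample, both sinusoids are just beginning. The samples are 0.02 seconds apart, so by the next sample the 10Hz sinusoid has completed one-fifth of a period, while the 60Hz sinusoid has completed one full period plus one-fifth of the next. Both are therefore at exactly the same point in their cycle every time we sample!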
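To see this numerically, the sketch below samples a 10Hz and a 60Hz sine wave at 50Hz and compares the resulting values. It uses plain Python rather than the notebook's plotting code, and the names `sample_sine` and `max_difference` are ours.

```python
import math

SAMPLING_RATE_HZ = 50
NUM_SAMPLES = 10


def sample_sine(freq_hz, sampling_rate_hz, num_samples):
    """Sample sin(2*pi*f*t) at t = n / sampling_rate for n = 0..num_samples-1."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sampling_rate_hz)
        for n in range(num_samples)
    ]


samples_10hz = sample_sine(10, SAMPLING_RATE_HZ, NUM_SAMPLES)
samples_60hz = sample_sine(60, SAMPLING_RATE_HZ, NUM_SAMPLES)

for n, (a, b) in enumerate(zip(samples_10hz, samples_60hz)):
    print(f"n={n}  10Hz: {a:+.3f}  60Hz: {b:+.3f}")

# Apart from floating-point rounding, the two sample sequences are identical.
max_difference = max(abs(a - b) for a, b in zip(samples_10hz, samples_60hz))
print(f"largest difference between the two sequences: {max_difference:.2e}")
```

Once only these samples are kept, nothing distinguishes the 60Hz source from a genuine 10Hz one.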

Given the aliasing formula, with a 50Hz sampling rate, 40Hz, 60Hz, 90Hz, 110Hz, 140Hz, and so on will all be aliased to 10Hz. Note, however, that the two members of each pair (40Hz and 60Hz, 90Hz and 110Hz, and so on) are aliased to 10Hz with a phase difference of one-half period relative to each other.

Similarly, 70Hz, 80Hz, 120Hz, and 130Hz will all be aliased to 20Hz, and 50Hz, 100Hz, 150Hz, etc. will all be aliased to zero.

Let's check it out:
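A minimal sketch of such a check, assuming the commonly used form of the aliasing formula, $f_{alias} = |f - f_s * round(f / f_s)|$, since this section does not spell the formula out. The function name `alias_frequency` is ours.

```python
def alias_frequency(signal_freq_hz, sampling_rate_hz):
    """Apparent frequency of a sinusoid after sampling.

    Assumed formula: subtract the nearest integer multiple of the
    sampling rate and take the absolute value.
    """
    nearest_multiple = round(signal_freq_hz / sampling_rate_hz)
    return abs(signal_freq_hz - nearest_multiple * sampling_rate_hz)


SAMPLING_RATE_HZ = 50

for freq_hz in (5, 10, 20, 40, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150):
    aliased_hz = alias_frequency(freq_hz, SAMPLING_RATE_HZ)
    print(f"{freq_hz:>3} Hz sampled at {SAMPLING_RATE_HZ} Hz appears as {aliased_hz} Hz")
```

Frequencies at or below the 25Hz Nyquist limit come back unchanged, while everything above it folds into the 0 to 25Hz range.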

Experimenting with the Nyquist limit

Let's keep experimenting with the Nyquist limit and aliasing but this time with sound data. Sound is a bit harder to visualize than our synthetic signals above because it's very high frequency (comparatively) but we'll provide zoomed insets to help.

We will also be using spectrogram visualizations to help us investigate the effect of lower sampling rates on the signal. A spectrogram plots the frequency components of our signal over time.

Frequency sweep from 0 - 22,050Hz

Let's start with a frequency sweep from 0 to 22,050Hz over a 30-second period.

Sweep with sampling rate of 44,100 Hz

Same sweep but with a 11,025 Hz sampling rate

Now, imagine that we captured this frequency sweep with an $f_s$ of 11,025Hz. What will the sweep sound like now? What's the maximum frequency that we can capture with $f_s = 11,025Hz$?

Well, from the Nyquist limit, we know that with $f_s$, the maximum capturable frequency is $\frac{1}{2} * f_s$. Thus, the maximum frequency that we can capture is $\frac{1}{2} * 11,025 Hz = 5512.5 Hz$. Recall, however, that the underlying audio signal contains frequencies from 0 to 22,050Hz. So, what will happen with frequencies between 5,512.5 - 22,050Hz in our signal?

Yes, aliasing strikes again. Those higher frequency signals will be aliased to lower frequencies.

Let's see the problem in action below.

With a sampling rate of 11,025Hz, we see aliasing occur when the frequency sweep hits 5,512.5 Hz.
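The sketch below estimates what the sweep should look like in the spectrogram. It assumes the sweep rises linearly in time (the section does not say which sweep shape is used) and that the aliased frequency follows the usual folding formula; the function names are ours.

```python
SWEEP_DURATION_S = 30.0
SWEEP_MAX_FREQ_HZ = 22_050.0
SAMPLING_RATE_HZ = 11_025.0


def sweep_frequency(t_s):
    """True frequency of the sweep at time t, assuming a linear sweep from 0 Hz."""
    return SWEEP_MAX_FREQ_HZ * t_s / SWEEP_DURATION_S


def apparent_frequency(signal_freq_hz, sampling_rate_hz):
    """Frequency observed after sampling (assumed folding formula)."""
    nearest_multiple = round(signal_freq_hz / sampling_rate_hz)
    return abs(signal_freq_hz - nearest_multiple * sampling_rate_hz)


for t_s in (0.0, 5.0, 7.5, 10.0, 15.0, 20.0, 22.5, 30.0):
    true_hz = sweep_frequency(t_s)
    seen_hz = apparent_frequency(true_hz, SAMPLING_RATE_HZ)
    print(f"t = {t_s:4.1f} s   true: {true_hz:8.1f} Hz   observed: {seen_hz:7.1f} Hz")
```

Under these assumptions the observed pitch rises until 7.5 seconds, when the sweep crosses 5,512.5Hz, then falls back toward 0Hz at 15 seconds, and repeats this zigzag for the second half of the sweep.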

Playing a chord with frequencies from 261Hz to 4186Hz

Let's play a chord composed of the following frequencies: 261.626Hz, 293.665Hz, 391.995Hz, 2093Hz, and 4186.01Hz.

Given that the highest frequency in this signal is 4,186.01Hz, we need a minimum sampling rate of $2 * 4186.01Hz = 8372.02Hz$ to capture the highest frequency and $2 * 2093Hz = 4,186Hz$ to capture the next highest frequency.

Sampled at 44,100Hz

The original sampling was at 44,100Hz, which is more than sufficient to capture the sound signal.

Sampled at 11,025 Hz

At a sampling rate of 11,025Hz, the Nyquist limit (5,512.5Hz) is still above the maximum frequency in our signal (4186.01 Hz), so the audio should sound exactly the same as it did for the original 44,100 Hz sample above (and no aliasing will occur).

Sampled at 4,410Hz

What about if the sampling rate $f_s$ is 4,410Hz? Then the Nyquist limit is 2,205Hz, which is below the 4186.01 Hz signal.

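Recall our aliased frequency formula. What will 4186.01Hz show up as in our sampled signal? It will be aliased as 223.99Hz, since $|4186.01Hz - 4410Hz| = 223.99Hz$. The other four frequencies in the chord are below the 2,205Hz Nyquist limit and are unaffected.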

Let's check to see what happens.
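Before listening, we can predict the outcome for all three sampling rates at once. This is a sketch using the assumed folding formula $f_{alias} = |f - f_s * round(f / f_s)|$; `describe_chord` is a hypothetical helper, not part of the notebook.

```python
CHORD_FREQS_HZ = (261.626, 293.665, 391.995, 2093.0, 4186.01)


def alias_frequency(signal_freq_hz, sampling_rate_hz):
    """Apparent frequency after sampling (assumed folding formula)."""
    nearest_multiple = round(signal_freq_hz / sampling_rate_hz)
    return abs(signal_freq_hz - nearest_multiple * sampling_rate_hz)


def describe_chord(sampling_rate_hz):
    """Return (true_freq, apparent_freq, is_aliased) for each chord tone."""
    nyquist_hz = sampling_rate_hz / 2.0
    return [
        (freq_hz, alias_frequency(freq_hz, sampling_rate_hz), freq_hz > nyquist_hz)
        for freq_hz in CHORD_FREQS_HZ
    ]


for sampling_rate_hz in (44_100, 11_025, 4_410):
    print(f"Sampling rate {sampling_rate_hz} Hz (Nyquist limit {sampling_rate_hz / 2} Hz)")
    for true_hz, seen_hz, is_aliased in describe_chord(sampling_rate_hz):
        note = "ALIASED" if is_aliased else "ok"
        print(f"  {true_hz:8.3f} Hz -> {seen_hz:8.3f} Hz  {note}")
```

Only the 4186.01Hz tone is flagged, and only at the 4,410Hz sampling rate.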

Example: how do sampling rates affect sound quality?

Below, we downsample a 44.1 kHz human voice to: 22.5kHz, 11,025Hz ... 441 Hz. For each downsampling, we visualize the original 44.1 kHz waveform as well as its downsampled counterpart.
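For intuition, the simplest possible downsampler just keeps every k-th sample. The sketch below does exactly that, with no anti-aliasing filter, so it is an illustrative assumption rather than the resampling method this notebook uses. It also only handles integer factors of 44,100Hz (for example, 2, 4, 10, and 100 give 22,050Hz, 11,025Hz, 4,410Hz, and 441Hz). A 440Hz test tone stands in for the voice recording.

```python
import math

ORIGINAL_RATE_HZ = 44_100


def decimate_naive(samples, factor):
    """Keep every `factor`-th sample.

    No low-pass filter is applied first, so any content above the new
    Nyquist limit will alias into the result.
    """
    return samples[::factor]


# Hypothetical stand-in for the voice clip: 0.1 seconds of a 440 Hz tone.
tone = [
    math.sin(2 * math.pi * 440 * n / ORIGINAL_RATE_HZ)
    for n in range(ORIGINAL_RATE_HZ // 10)
]

for factor in (2, 4, 10, 100):
    downsampled = decimate_naive(tone, factor)
    new_rate_hz = ORIGINAL_RATE_HZ / factor
    print(f"factor {factor:>3}: {new_rate_hz:8.1f} Hz, {len(downsampled)} samples")
```

At a factor of 100 the new Nyquist limit is 220.5Hz, which is below the 440Hz tone, so even this simple test signal would be aliased.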

22,500 Hz sampling rate

11,025 Hz sampling rate

4,410 Hz sampling rate